Entropy, Thermostats and 
Chaotic Hypothesis 

Giovanni Gallavotti 

Fisica and I.N.F.N. Roma 1
February 6, 2008 

Abstract: The chaotic hypothesis is proposed as a basis 
for a general theory of nonequilibrium stationary states. 

1. Stationary states and thermostats. 

The problem is to develop methods to establish relations between time averages of a few observables associated with a system of particles subject to work-performing external forces and to thermostat forces that keep the energy from building up, so that the system can be considered to be in a stationary state.

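The stationary state will correspond to a probability distribution $\mu$ on phase space $T$ such that, for smooth observables $F$,

$$\lim_{\tau\to\infty}\frac{1}{\tau}\sum_{j=0}^{\tau-1}F(S^{j}x)=\int F(y)\,\mu(dy),\qquad \lim_{\tau\to\infty}\frac{1}{\tau}\int_{0}^{\tau}F(S_{t}x)\,dt=\int F(y)\,\mu(dy)\qquad(1.1)$$

for all $x$ but a set of zero volume: the first refers to cases in which the dynamics is a map $S: T\to T$ and the second to cases in which it is a flow $S_t$ defined by a differential equation on $T$:

$$\dot x=f_{E}(x)\qquad(1.2)$$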
where $f_E$ contains internal forces, external forces depending on a few parameters $E=(E_1,\ldots,E_n)$, and thermostat forces. In general the divergence

$$-\sigma(x)=\sum_{i}\partial_{x_i}f_{E,i}(x)\qquad(1.3)$$

is not zero, except in the absence of external forces $E$ and of thermostat forces (i.e. in the equilibrium case).
A fairly realistic example is the following: 



Fig.1: "Thermostats", or reservoirs, occupy finite regions outside $C_0$, e.g. sectors $C'_a\subset R^3$, $a=1,2,\ldots$, marked $T_a$ and located beyond the "buffers" $C_a$. The buffers (representing the walls separating the system from the thermostats) simply have their boundaries marked. The reservoir particles are constrained by suitable forces to have a constant total kinetic energy $K_a$, so that their "temperatures" $T_a$, see (1.5), are well defined. Buffers and reservoirs have arbitrary sizes.
The system contains $N_0$ particles in a configuration $X_0$ contained in $C_0$, and $N_i$, $N'_i$ particles in configurations that will be denoted $X_i$, $X'_i$, contained in the buffer regions $C_i$, henceforth called walls, and in the thermostat regions $C'_i$, $i=1,\ldots,n$, respectively. The equations of motion are, for $i=0$ and $i>0$ respectively,



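$$\begin{aligned}
\ddot X_0&=-\partial_{X_0}\Big(U_0(X_0)+\sum_{i>0}W_{0,i}(X_0,X_i)\Big)+E(X_0)\\
\ddot X_i&=-\partial_{X_i}\big(U_i(X_i)+W_{0,i}(X_0,X_i)+W_{i,i'}(X_i,X'_i)\big)\\
\ddot X'_i&=-\partial_{X'_i}\big(U'_i(X'_i)+W_{i,i'}(X_i,X'_i)\big)-\alpha_i\dot X'_i
\end{aligned}\qquad(1.4)$$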



where $U_i$, $U'_i$ are the interaction energies for the particles in $C_i$, $i=0,1,\ldots,n$, and in $C'_i$, $i=1,\ldots,n$; $E(X_0)$ is the external force working on the system in $C_0$ and $-\alpha_i\dot X'_i$ is the thermostat force, i.e. the force prescribed by Gauss' principle of least effort, see Appendix A9.4 in [?], to impose the constraints ($k_B$ = Boltzmann's constant)



$$K_i\equiv\frac12\,\dot X_i'^{\,2}=\frac{d}{2}\,N'_i\,k_BT_i\qquad(d=\text{space dimension})\qquad(1.5)$$



which gives, after a simple application of the principle,

$$\alpha_i=\frac{L_i-\dot U'_i}{d\,N'_i\,k_BT_i}\qquad(1.6)$$

where $L_i$ is the work done per unit time by the particles $X_i\in C_i$ on those in $X'_i\in C'_i$, i.e. on the thermostats.
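The way the multiplier of (1.6) enforces the constraint (1.5) can be checked numerically. The following is only a minimal sketch and is not taken from the text: it assumes unit masses, collects all non-thermostat forces acting on the thermostat particles into one hypothetical list `forces` (so that the power `v . F` plays the role of $L_i-\dot U'_i$ and `v . v` that of $2K_i$), and uses random numbers in place of an actual configuration. All function and variable names are illustrative.

```python
import random

def gauss_multiplier(velocities, forces):
    """Return alpha such that the extra force -alpha*v keeps |v|^2/2 constant.

    For the constraint |v|^2/2 = const, Gauss' principle gives
    alpha = (v . F) / (v . v), with F the sum of all non-thermostat forces.
    """
    power = sum(v * f for v, f in zip(velocities, forces))   # v . F
    twice_kinetic = sum(v * v for v in velocities)           # v . v = 2K
    return power / twice_kinetic

def kinetic_energy_rate(velocities, forces):
    """Time derivative of |v|^2/2 under the force F - alpha*v (unit masses)."""
    alpha = gauss_multiplier(velocities, forces)
    return sum(v * (f - alpha * v) for v, f in zip(velocities, forces))

rng = random.Random(0)
v = [rng.uniform(-1.0, 1.0) for _ in range(12)]   # hypothetical thermostat velocities
F = [rng.uniform(-1.0, 1.0) for _ in range(12)]   # hypothetical non-thermostat forces
rate = kinetic_energy_rate(v, F)                  # should vanish up to rounding
```

With this choice of `alpha` the quantity `rate` vanishes up to rounding errors, whatever the forces are: this is the content of the constraint on the thermostat kinetic energy.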

Other thermostat models could be considered: however their particular structure should not influence the statistical properties of the particles in $C_0$. In particular I think that replacing the container $C'_i$ with an infinite container in which particles are initially in a state that is an equilibrium Gibbs state at temperature $T_i$ should lead to the same results: this is a conjecture whose proof seems quite far at the moment.

In the following we shall regard the equations (1.4) as first order equations on the phase space coordinates $x=\{\dot X_i,X_i\}_{i=0}^{n}$. As such the equations do not conserve the volume of phase space: in fact the divergence of the equations in this space is $-\sigma(x)$ with

$$\sigma(x)=\sum_{i>0}\frac{L_i}{k_BT_i}\,\frac{dN'_i-1}{dN'_i}+\dot\Phi(x)\qquad(1.7)$$

where $\Phi=-\sum_{i>0}\frac{U'_i}{k_BT_i}\,\frac{dN'_i-1}{dN'_i}$, as can be checked by direct computation.

Since $L_i=-\dot X'_i\cdot\partial_{X'_i}W_{i,i'}\equiv+\dot X_i\cdot\partial_{X_i}W_{i,i'}-\dot W_{i,i'}$, and since the equations of motion of the walls give $\dot X_i\cdot\partial_{X_i}W_{i,i'}=-\frac{d}{dt}\big(\frac12\dot X_i^2+U_i\big)-\dot X_i\cdot\partial_{X_i}W_{0,i}$, the expression (1.7) is, up to the factors $\frac{dN'_i-1}{dN'_i}$, the sum over $i>0$ of

$$-\frac{1}{k_BT_i}\Big[\frac{d}{dt}\Big(\frac12\dot X_i^2+U_i+W_{i,i'}+U'_i\Big)+\dot X_i\cdot\partial_{X_i}W_{0,i}\Big]$$

which has the form of a total time derivative plus $\frac{Q_i}{k_BT_i}$, where $Q_i=-\dot X_i\cdot\partial_{X_i}W_{0,i}$ is the work per unit time done by the forces due to particles in $C_0$ on the particles in $C_i$: we identify therefore $Q_i$ with the heat generated per unit time by the forces acting on $C_0$ and transferred first to the walls $C_i$ and, subsequently, to the thermostats in $C'_i$.

Thus setting $\varepsilon(x)=\sum_{i>0}\frac{Q_i}{k_BT_i}$ it is (for notational simplicity, and keeping in mind that $N'_i$ should be thought of as large, we shall neglect $O(N_i'^{-1})$)

$$\sigma(x)=\varepsilon(x)+\dot R(x),\qquad\text{where}\quad R(x)=-\sum_{i>0}\frac{\frac12\dot X_i^2+U_i+W_{i,i'}+U'_i}{k_BT_i}\qquad(1.8)$$

Remark: (1) In this model, as well as in a large number of others, one has therefore the natural interpretation of $\sigma(x)$ as the entropy creation per unit time: this is because the average of $\sigma(x)$ over a time interval of length $\tau$ and the corresponding average of $\varepsilon(x)=\sum_{i>0}\frac{Q_i}{k_BT_i}$ become equal at large time, since they differ by $\frac{1}{\tau}\big(R(S_\tau x)-R(x)\big)$, at least if $R$ is bounded, as it is convenient to suppose for simplicity. This is a strong assumption but it will not be discussed here: it has to do with the problem of thermostats "efficiency" and its violation may lead to interesting consequences, see [?].

(2) It should be noted that the walls $C_i$ could be missing and the particles in $C_0$ be directly in contact with the thermostats: in this case there will be no $W_{i,i'}$ but instead there will be potentials $W_{0,i'}$: the analysis would be entirely analogous with $\frac{Q_i}{k_BT_i}$ replaced by $\frac{Q'_i}{k_BT_i}$, with $Q'_i$ being the work per unit time done by the particles in $C_0$ on the thermostat particles in $C'_i$, and $R=-\sum_{i>0}\frac{U'_i}{k_BT_i}$. In this case if the potentials of interaction are bounded then $R$ will also be bounded without any extra assumption.

(3) The $L_i$ in Eq.(1.7) is the work ceded by the walls to the thermostats: therefore it can be interpreted as the heat $Q'_i$ ceded by the particles in $C_i$ to the thermostat in $C'_i$: hence the alternative representation $\sigma(x)=\varepsilon'(x)+\dot\Phi$, (1.7), is possible with $\varepsilon'(x)=\sum_{i>0}\frac{Q'_i}{k_BT_i}$. Also in this case the remainder $\Phi$ is bounded if the interaction potentials are bounded, and the discussion that follows applies to both $\varepsilon(x)$ and $\varepsilon'(x)$, which are thus equivalent for the purpose of fluctuation analysis.

2. The hypothesis. 

Chaotic Hypothesis: Motions developing on the attracting set of a chaotic system can be regarded as a transitive hyperbolic system.

A general result is that transitive hyperbolic systems have the property (1.1), with $\mu$ a uniquely determined probability distribution on phase space, [?].

Of course a flow can be studied via a Poincaré map $S$ defined by a timing event $\Sigma$. The latter is defined by a surface in phase space which is crossed by all trajectories infinitely many times (typically $\Sigma$ is the union of a few connected surface elements, $\Sigma=\cup_j\Sigma_j$, so in general it is not connected: i.e. it is a finite collection of connected pieces). The timing event occurs when a trajectory crosses $\Sigma$ at a point $x$ and time $t_0$, and $S$ maps it into the next timing event $Sx$ occurring, at some time $t_1$, on the trajectory $t\to S_tx$: hence $x'=Sx\stackrel{def}{=}S_{t_1-t_0}x\in\Sigma$.

For model (1.4) there is a direct relation between $\sigma(x)$, $x\in\Sigma$, and the Jacobian determinant $\det\partial_xS(x)$; setting $R(t_1)=R(S_{t_1-t_0}x)$, $R(t_0)=R(x)$, it is

$$-\log|\det\partial_xS(x)|=\int_{t_0}^{t_1}\sigma(S_tx)\,dt=\int_{t_0}^{t_1}\varepsilon(S_tx)\,dt+R(t_1)-R(t_0)=\sum_{i>0}\int_{t_0}^{t_1}\frac{Q_i}{k_BT_i}\,dt+R(t_1)-R(t_0)\qquad(2.1)$$

The theories of evolutions described by flows and of those described by maps are therefore very closely related, as the above remarks show, at least for what concerns the analysis of the entropy creation rate and its fluctuations.

The second viewpoint should be taken whenever $\sigma(x)$ has singularities, which can happen if the interaction potentials are unbounded (e.g. of Lennard-Jones type) or if the thermostats sizes tend to infinity, see [?].

3. Dimensionless entropy and fluctuation theorem. 

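Interesting properties to study are related to the fluctuations of entropy creation averages. Restricting the analysis to the model (1.4), define the entropy creation rate to be

$$\varepsilon_+\stackrel{def}{=}\lim_{\tau\to\infty}\frac1\tau\int_0^\tau\sigma(S_tx)\,dt=\lim_{\tau\to\infty}\frac1\tau\int_0^\tau\varepsilon(S_tx)\,dt\qquad(3.1)$$

by the remark at the end of Sec.1.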

Assuming that the system is dissipative, which by definition will mean $\varepsilon_+>0$, consider the random variable

$$p\stackrel{def}{=}\frac1\tau\int_0^\tau\frac{\varepsilon(S_tx)}{\varepsilon_+}\,dt\qquad(3.2)$$

that will be called the dimensionless phase space contraction and considered with the distribution inherited from the SRB-distribution $\mu$ of the system.

A general property of random variables of the form $a=\frac1\tau\int_0^\tau F(S_tx)\,dt$, which are time averages over a time $\tau$ of a smooth observable $F$, is that, if motions are transitive and hyperbolic, the SRB-probability $P_\mu$ that $a$ is in a closed interval $\Delta$ has the form

$$P_\mu(a\in\Delta)=\exp\big(\tau\max_{a\in\Delta}\zeta_F(a)+O(1)\big)\qquad(3.3)$$

for $\Delta\subset(a_-,a_+)$, where $a_\pm$ are two suitable values within which the function $\zeta_F(a)$ is defined, analytic and convex; the fluctuation interval $[a_-,a_+]$ contains the $\mu$-average value of $F$ and if $\Delta\cap[a_-,a_+]=\emptyset$ the probability $P_\mu(a\in\Delta)$ tends to $0$ as $\tau\to\infty$ faster than exponentially. For this reason the function $\zeta_F(a)$ can be naturally defined also for $a\notin[a_-,a_+]$ by giving it the value $-\infty$, [?]. Finally $O(1)$ means a quantity which is bounded as $\tau\to\infty$ at $\Delta$ fixed.

The function $\zeta_F(a)$ is called the large deviations rate for the fluctuations of the observable $F$.

If the motions are also reversible, i.e. if there is an isometry $I$ of phase space such that $IS_t=S_{-t}I$, or $IS=S^{-1}I$ in the case of time evolution maps, any observable $F$ which is odd under time reversal, i.e. $F(Ix)=-F(x)$, will have a fluctuation interval $[-a^*,a^*]$ symmetric around the origin (and containing the SRB-average of $F$).

In the case of the model (1.4) time reversibility corresponds to the velocity inversion and the evolution is reversible in the just defined sense. The fluctuation interval of $\sigma(x)/\varepsilon_+$ and of $\varepsilon(x)/\varepsilon_+$ is therefore symmetric around the origin and $p^*>1$ because the averages of the two observables are 1 by definition, see (3.1), (3.2).

A general theorem that holds for transitive, hyperbolic motions is the following.

Fluctuation theorem: Given a hyperbolic, transitive and reversible system assume that the SRB average $\sigma_+$ of the phase space contraction $\sigma(x)$, defined through the divergence of the equations of motion in (1.3), is $\sigma_+>0$. Consider the dimensionless phase space contraction $\sigma(x)/\sigma_+$: this is an observable which has a large deviations rate $\zeta(p)$ defined in a symmetric interval $(-p^*,p^*)$ and satisfying there

$$\zeta(-p)=\zeta(p)-p\,\sigma_+\qquad(3.4)$$

Remarks: (i) The (3.4) can be regarded as valid for all $p$'s if we follow the mentioned convention of defining $\zeta(p)=-\infty$ for $p\notin[-p^*,p^*]$.

(ii) By the chaotic hypothesis, abridged CH, it follows that a relation like (3.4) should hold for the SRB distribution of the dimensionless phase space contraction of any reversible chaotic motion with a dense attractor or, more generally, for the dimensionless phase space contraction of the motions restricted to the attracting set, if a time reversal symmetry holds on the motions restricted to the attracting set, [?]. Of course this is not a theorem (mainly because hyperbolicity is a hypothesis) but it should nevertheless apply to many interesting cases.

(iii) In particular it should apply to the model (1.4): actually in this case it has already been remarked that the observable $\sigma(x)/\sigma_+$ and the dimensionless entropy creation rate $\varepsilon(x)/\varepsilon_+$ have the same large deviations function; hence (3.4) should hold for the rate function of $p=\frac1\tau\int_0^\tau\sum_{a>0}\frac{Q_a(S_tx)}{k_BT_a\,\varepsilon_+}\,dt$:

$$\zeta(-p)=\zeta(p)-p\,\varepsilon_+,\qquad p\in(-p^*,p^*)\qquad(3.5)$$

(iv) The latter remark is interesting because the quantity $\varepsilon(x)=\sum_{a>0}\frac{Q_a}{k_BT_a}$ has a physical meaning and can be measured in experiments like the one described in Fig.1 or in experiments for which there is not an obvious equation of motion (i.e. no obvious model).

(v) Therefore in applications the relation (3.5) is expected to hold quite generally and, in the general cases, it is called fluctuation relation, abridged FR, to distinguish it from the Fluctuation Theorem.

(vi) Furthermore the quantity $\varepsilon(x)$ is a local quantity as it depends only on the microscopic configurations of the system $C_0$ and of the walls $C_i$ in the immediate vicinity of their separating boundary. In particular the relation (3.5) does not depend on what happens in the bulk of the walls $C_i$ or on the size of the thermostats $C'_i$: hence the latter can be taken to infinity. One can also imagine that (3.5) remains valid in the case of infinite thermostats whose particles are initially distributed so that their empirical distribution is asymptotically a Gibbs state at temperature $T_a$.

(vii) The last few comments suggest quite a few tests of the chaotic hypothesis and of the corresponding fluctuation relation in various cases, see for instance [11]. Therefore the fluctuation relation, first suggested by the simulation in [?], where it has been discovered in an experiment motivated to test ideas emerging from the SRB theory, and subsequently proved as a theorem for Anosov systems in [?], gave rise to the chaotic hypothesis and at the moment experiments are being designed to test its predictions.

(viii) The theorem will be referred to as FT. It is often written in the form, see (3.3), (3.4),

$$\lim_{\tau\to\infty}\frac1\tau\log\frac{P_\mu(p\in\Delta)}{P_\mu(-p\in\Delta)}=\sigma_+\max_{p\in\Delta}p\qquad(3.6)$$

for $\Delta\subset(-p^*,p^*)$ or in the more suggestive, although slightly imprecise, form:

$$\lim_{\tau\to\infty}\frac1\tau\log\frac{P_\mu(p)}{P_\mu(-p)}=p\,\sigma_+\qquad(3.7)$$

which can be regarded as valid for $p\in(-p^*,p^*)$.

(ix) It is natural to think that the special way in which the thermostats are implemented is not important as long as the notion of temperature of the thermostats is clearly understood. For instance an alternative thermostat could be a stochastic one with particles bouncing off the walls with a Maxwellian velocity distribution at a temperature depending on the wall hit. In this context the experiment in [15] appears to give an interesting confirmation.
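The symmetry (3.4) is easy to test on a candidate rate function. The sketch below is an illustration only and is not part of the theory above: it assumes a hypothetical Gaussian rate $\zeta(p)=-\sigma_+(p-1)^2/4$, centered at the average value $p=1$, whose curvature is chosen so that (3.4) holds. A true rate function is defined only on the finite interval $(-p^*,p^*)$ and need not be quadratic; the value of `sigma_plus` is arbitrary.

```python
def zeta(p, sigma_plus):
    """Hypothetical Gaussian rate function with maximum at p = 1 (illustration only)."""
    return -sigma_plus * (p - 1.0) ** 2 / 4.0

def fr_defect(p, sigma_plus):
    """Return zeta(-p) - (zeta(p) - p*sigma_plus); it is zero when (3.4) holds."""
    return zeta(-p, sigma_plus) - (zeta(p, sigma_plus) - p * sigma_plus)

sigma_plus = 0.7                                   # arbitrary positive average contraction
grid = [k / 10.0 for k in range(-30, 31)]          # values of p between -3 and 3
max_defect = max(abs(fr_defect(p, sigma_plus)) for p in grid)
```

For this quadratic choice `max_defect` is zero up to rounding. Changing the prefactor $1/4$ breaks the symmetry, which shows that (3.4) ties the width of Gaussian fluctuations of $p$ to the average $\sigma_+$.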

4. Extending Onsager-Machlup's fluctuations theory

A remarkable theory on nonequilibrium fluctuations has been started by Onsager and Machlup, [?], and concerns fluctuations near equilibrium and, in fact, it only deals with properties of derivatives with respect to the external forces parameters $E$ evaluated at $E=0$.

The objects of the analysis are fluctuation patterns: the question is what is the probability that the successive values of $F(S_tx)$ follow, for $t\in[-\frac\tau2,\frac\tau2]$, a preassigned sequence of values, that I call pattern $\varphi(t)$, [?].

In a reversible hyperbolic and transitive system consider $n$ observables $F_1,\ldots,F_n$ which have a well defined parity under time reversal, $F_j(Ix)=\pm F_j(x)$. Given $n$ functions $\varphi_j(t)$, $j=1,\ldots,n$, defined for $t\in[-\frac\tau2,\frac\tau2]$, the question is: what is the probability that $F_j(S_tx)\sim\varphi_j(t)$ for $t\in[-\frac\tau2,\frac\tau2]$? The following FPT theorem gives an answer:

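Fluctuation Patterns Theorem: Under the assumptions of the fluctuation theorem, given $F_j$, $\varphi_j$, and given $\varepsilon>0$ and an interval $\Delta\subset(-p^*,p^*)$, the joint probability with respect to the SRB distribution satisfies

$$\frac{P_\mu\big(|F_j(S_tx)-\varphi_j(t)|_{j=1,\ldots,n}<\varepsilon,\;p\in\Delta\big)}{P_\mu\big(|F_j(S_tx)\mp\varphi_j(-t)|_{j=1,\ldots,n}<\varepsilon,\;-p\in\Delta\big)}=\exp\big(\tau\max_{p\in\Delta}p\,\sigma_++O(1)\big)\qquad(4.1)$$

where the sign choice $\mp$ is opposite to the parity of $F_j$ and $p\stackrel{def}{=}\frac1\tau\int_{-\tau/2}^{\tau/2}\frac{\sigma(S_tx)}{\sigma_+}\,dt$.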

Remarks: (i) The FPT theorem means that "all that has to be done to change the time arrow is to change the sign of the entropy production", i.e. the time reversed processes occur with the same likelihood as the direct processes if conditioned to the opposite entropy creation. This is made clearer by rewriting the above equation in terms of probabilities conditioned on a preassigned value of $p$; in fact up to $e^{O(1)}$ it becomes, [?], for $|p|<p^*$:

$$\frac{P_\mu\big(|F_j(S_tx)-\varphi_j(t)|_{j=1,\ldots,n}<\varepsilon\;\big|\;p\big)}{P_\mu\big(|F_j(S_tx)\mp\varphi_j(-t)|_{j=1,\ldots,n}<\varepsilon\;\big|\;-p\big)}=1\qquad(4.2)$$



(ii) An immediate consequence is that, defining $f_j$ as the averages $f_j=\frac1\tau\int_{-\tau/2}^{\tau/2}F_j(S_tx)\,dt$, the SRB probability that $f_1,\ldots,f_n$ occur in presence of an entropy creation rate $p$ is related to the occurrence of $\mp f_1,\ldots,\mp f_n$ in presence of the opposite entropy creation rate: in a slightly imprecise form, see remark (viii) in Sec. 3 and (3.7), this means that

$$\lim_{\tau\to\infty}\frac1\tau\log\frac{P_\mu(f_1,\ldots,f_n,p)}{P_\mu(\mp f_1,\ldots,\mp f_n,-p)}=p\,\sigma_+\qquad(4.3)$$



(iii) In particular if $F_j$ are odd under time reversal and $p$ can be expressed as an (obviously odd) function of $f_1,\ldots,f_n$, $p=\pi(f_1,\ldots,f_n)$, the (4.3) can be written

$$\lim_{\tau\to\infty}\frac1\tau\log\frac{P_\mu(f_1,\ldots,f_n)}{P_\mu(-f_1,\ldots,-f_n)}=\pi(f_1,\ldots,f_n)\,\sigma_+\qquad(4.4)$$

for $\pi(f_1,\ldots,f_n)\in(-p^*,p^*)$: a particular case of this relation is relevant for Kraichnan's theory of turbulence, [?].

(iv) An interesting application, [20], [21], of (4.3) with $j_j(x)=\partial_{E_j}\sigma(x)$ is that, setting $J_j=\mu(j_j)\equiv\langle j_j\rangle$, it is

$$L_{jk}\equiv\partial_{E_k}J_j\big|_{E=0}=L_{kj}\qquad(4.5)$$

Since in several interesting cases $J_j$ have the interpretation of "thermodynamic currents" (i.e. currents divided by $k_BT$ if $T$ is the temperature) generated by the "thermodynamic forces" $E_j$, the (4.5) have the interpretation of Onsager reciprocal relations. In fact also the expressions

$$L_{jk}=\frac12\int_{-\infty}^{\infty}\mu\big(j_k(S_tx)\,j_j(x)\big)_{E=0}\,dt\qquad(4.6)$$

follow from FPT and have the interpretation of Green-Kubo formulae. The above relations have been derived under the extra simplification that $\sigma=0$ for $E=0$, which is satisfied in several cases, see comment following Eq.(3.5) in [21]. However what is really necessary is that $\langle\sigma\rangle_{E=0}=0$, which is an even weaker assumption because the analysis in [20] is, verbatim, unchanged if instead of $\sigma=0$ for $E=0$ one has $\langle\sigma\rangle_{E=0}=0$.

(v) The assumption of reversibility at $E\ne0$, which is necessary for the FPT, is not really necessary to derive (4.6) (hence (4.5)), as shown in [?] where such relations are derived under the only assumption that the motion is reversible just for $E=0$.

(vi) A further application of FPT is its relation with the theory of intermittency, see [23], [24].

(vii) The above analysis and the arbitrariness of the walls $C_i$ hint that even if the thermostatting mechanism is quite different, for instance if it is generated by viscous forces $-\nu_j\dot X_j$, hence not reversible, nevertheless the quantity $\varepsilon(x)$ will satisfy a FR.

(viii) In any event it appears that the total phase space divergence $\sigma(x)$ is not directly physically relevant and in fact it is not physically meaningful. Since it differs from the physically measurable entropy production $\varepsilon(x)$ by a total derivative it can only be used to infer properties of the latter, as done in the FR: of course a FR will hold also for $\varepsilon(x)$ in the reversible cases. However, given the possibly very large (arbitrarily large) size of the contributions to $\sigma(x)$ due to the total derivative $\dot R(x)$ in (2.1), or (1.8), the time scale for the large fluctuations of $p'=\frac1\tau\int_0^\tau\frac{\sigma(S_tx)}{\sigma_+}\,dt$ easily becomes unobservably large, while the time scale for the fluctuations of $\varepsilon(x)$ remains independent of the size of the walls $C_i$ and of the thermostats.
5. JF, BF and fluctuation relations 

An immediate consequence of FT is that

$$\Big\langle e^{-\int_0^\tau\varepsilon(S_tx)\,dt}\Big\rangle_{SRB}=e^{O(1)}\qquad(5.1)$$

i.e. $\big\langle e^{-\int_0^\tau\varepsilon(S_tx)\,dt}\big\rangle_{SRB}$ stays bounded as $\tau\to\infty$. This is a relation that I will call Bonetto's formula and denote it BF, see Eq.(9.10.4) in [2]; it can also be written, somewhat imprecisely and for mnemonic purposes, [25],

$$\Big\langle e^{-\int_0^\tau\varepsilon(S_tx)\,dt}\Big\rangle_{SRB}=1\qquad(5.2)$$

which would be exact if the FT in the form (3.7) held for finite $\tau$ (rather than in the limit as $\tau\to\infty$).
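The step from FT to BF can be sketched as follows. This is a heuristic computation, not a proof: it assumes, for illustration only, that $p$ takes values in a discrete set, that (3.7) holds at finite $\tau$ with an $O(1)$ error uniform in $p$, and that $R$ is bounded.

```latex
\begin{aligned}
\Big\langle e^{-\int_0^\tau \sigma(S_t x)\,dt}\Big\rangle_{SRB}
  &= \sum_{p} P_\mu(p)\, e^{-\tau p\,\sigma_+}
   = \sum_{p} P_\mu(-p)\, e^{\tau p\,\sigma_+ + O(1)}\, e^{-\tau p\,\sigma_+} \\
  &= e^{O(1)} \sum_{p} P_\mu(-p) \;=\; e^{O(1)} .
\end{aligned}
```

Since $\int_0^\tau\sigma(S_tx)\,dt$ and $\int_0^\tau\varepsilon(S_tx)\,dt$ differ by $R(S_\tau x)-R(x)$, which is bounded under the assumption on $R$, the same bound holds with $\varepsilon$ in place of $\sigma$.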

This relation bears resemblance to Jarzynski's formula, henceforth JF, which deals with a canonical Gibbs distribution (in a finite volume) corresponding to a Hamiltonian $H_0(p,q)$ and temperature $T=(k_B\beta)^{-1}$, and with a time dependent family of Hamiltonians $H(p,q,t)$, called a protocol, which interpolates between $H_0$ and a second Hamiltonian $H_1$ as $t$ grows from $0$ to $1$ (in suitable units).

Imagine extracting samples $(p,q)$ with a canonical probability distribution $\mu_0(dpdq)=Z_0^{-1}e^{-\beta H_0(p,q)}dpdq$, with $Z_0$ being the canonical partition function, and let $S_{0,t}(p,q)$ be the solution of the Hamiltonian time dependent equations $\dot p=-\partial_qH(p,q,t)$, $\dot q=\partial_pH(p,q,t)$ for $0\le t\le1$. Then JF, [?], gives:

Let $(p',q')\stackrel{def}{=}S_{0,1}(p,q)$ and let $W(p',q')\stackrel{def}{=}H_1(p',q')-H_0(p,q)$; then the distribution $Z_1^{-1}e^{-\beta H_1(p',q')}dp'dq'$ is exactly equal to $\frac{Z_0}{Z_1}e^{-\beta W(p',q')}\mu_0(dpdq)$. Hence

$$\big\langle e^{-\beta W}\big\rangle_{\mu_0}=\frac{Z_1}{Z_0}=e^{-\beta\Delta F(\beta)}\qquad(5.3)$$

where the average is with respect to the Gibbs distribution $\mu_0$ and $\Delta F$ is the free energy variation between the equilibrium states with Hamiltonians $H_1$ and $H_0$ respectively.

Remark: (i) The reader will recognize in this exact identity an instance of the Monte Carlo method. Its interest lies in the fact that it can be implemented without actually knowing $H_0$, $H_1$ or the protocol $H(p,q,t)$. If one wants to evaluate the difference in free energy between two equilibrium states at the same temperature of a system that one can construct in a laboratory then "all one has to do" is

(a) to fix a protocol, i.e. a procedure to transform the forces acting on the system along a well defined path, fixed once and for all, from the initial values to the final values in a fixed time interval ($t=1$ in some units), and

(b) to measure the energy variation $W$ generated by the machines implementing the protocol. This is a really measurable quantity at least in the cases in which $W$ can be interpreted as the work done on the system, or related to it.

Then average the exponential of $-\beta W$ over a large number of repetitions of the protocol. This can be useful even, and perhaps mainly, in biological experiments.

(ii) If the "protocol" conserves energy (like a Joule expansion of a gas) or if the difference $W=H_1(p',q')-H_0(p,q)$ has zero average in the equilibrium state $\mu_0$ we get, by Jensen's inequality (i.e. by the convexity of the exponential function, $\langle e^{A}\rangle\ge e^{\langle A\rangle}$), that $\Delta F\le0$ as expected from Thermodynamics.

(iii) The measurability of W is a difficult question, to 
be discussed on a case by case basis. It is often possi- 
ble to identify it with the "work done by the machines 
implementing the protocol" . 
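The Monte Carlo reading of (5.3) described in remark (i) can be tried on a toy case. The sketch below is an illustration under explicit assumptions that are not in the text: a single harmonic degree of freedom with $H_0=p^2/2+k_0q^2/2$ and $H_1=p^2/2+k_1q^2/2$, and a protocol idealized as an instantaneous change of stiffness, so that $(p',q')=(p,q)$ and $W=(k_1-k_0)q^2/2$. For this model $Z_1/Z_0=\sqrt{k_0/k_1}$. All names and parameter values are hypothetical.

```python
import math
import random

def jarzynski_estimate(k0, k1, beta, n_samples, seed=0):
    """Monte Carlo estimate of <exp(-beta*W)> for a sudden stiffness change.

    Assumed toy model: H0 = p^2/2 + k0*q^2/2, H1 = p^2/2 + k1*q^2/2, with an
    instantaneous switch, so that (p', q') = (p, q) and W = (k1 - k0)*q^2/2.
    """
    rng = random.Random(seed)
    sigma_q = 1.0 / math.sqrt(beta * k0)      # canonical standard deviation of q under H0
    total = 0.0
    for _ in range(n_samples):
        q = rng.gauss(0.0, sigma_q)           # sample q from mu_0 (p does not enter W)
        work = 0.5 * (k1 - k0) * q * q        # energy variation W for this sample
        total += math.exp(-beta * work)
    return total / n_samples

k0, k1, beta = 1.0, 2.0, 1.0
estimate = jarzynski_estimate(k0, k1, beta, n_samples=200_000)
exact = math.sqrt(k0 / k1)                    # Z1/Z0 for this toy model
delta_F = -math.log(estimate) / beta          # free energy variation from (5.3)
```

Here `estimate` approaches `exact` as the number of samples grows, and `delta_F` approaches $\frac{1}{2\beta}\log(k_1/k_0)$. In a laboratory setting $W$ would be measured rather than computed from a known Hamiltonian, which is the point of remark (i).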

The two formulae (5.2) and (5.3) are however quite different:

(1) the $\int_0^\tau\sigma(S_tx)\,dt$ is an entropy creation rather than the energy variation $W$.

(2) the average is over the SRB distribution of a stationary state, in general out of equilibrium, rather than over a canonical equilibrium state.

(3) the BF says that $\big\langle e^{-\int_0^\tau\varepsilon(S_tx)\,dt}\big\rangle_{SRB}$ is bounded, (5.1), as $\tau\to\infty$ rather than being 1 exactly. However a careful analysis of the meaning of $W$ would lead to the conclusion that also JF necessitates corrections, particularly in thermostatted systems, [27].

The JF has proved useful in various equilibrium problems (to evaluate the free energy variation when an equilibrium state with Hamiltonian $H_0$ is compared to one with Hamiltonian $H_1$); hence it has some interest to investigate whether (5.2) can have some consequences.

If a system is in a steady state and produces entropy at rate $\varepsilon_+$ (e.g. a living organism feeding on a background) the FT (3.4) and its consequence BF, (5.2), give us information on the fluctuations of entropy production, i.e. of heat produced, and (5.2) could be useful, for instance, to check that all relevant heat transfers have been properly taken into account.